Minimal mappings

Minimal mappings are the result of an advanced semantic matching technique. Semantic matching is used in computer science to identify information that is semantically related.

Semantic matching has been proposed as a solution to the semantic heterogeneity problem, namely the problem of supporting diversity in knowledge. Given any two graph-like structures, e.g. classifications, database schemas, XML schemas or ontologies, matching is an operator that identifies the nodes in the two structures that semantically correspond to one another. For example, applied to file systems, it can identify that a folder labeled “car” is semantically equivalent to another folder labeled “automobile” because the two words are synonyms in English.
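The folder example above can be sketched in a few lines of Python. This is only a toy illustration: the synonym table SYNONYMS and the functions relation and match are hypothetical names introduced here, and a real matcher would consult a lexical resource and support more relations than equivalence.

```python
from typing import List, Optional, Tuple

# Hypothetical, hand-written synonym table (an assumption for this sketch;
# a real matcher would rely on a lexical resource instead).
SYNONYMS = {frozenset({"car", "automobile"})}


def relation(a: str, b: str) -> Optional[str]:
    """Return "equivalence" if the two labels are equal or listed as synonyms."""
    a, b = a.lower(), b.lower()
    if a == b or frozenset({a, b}) in SYNONYMS:
        return "equivalence"
    return None


def match(labels1: List[str], labels2: List[str]) -> List[Tuple[str, str, str]]:
    """Compare every label of the first structure with every label of the second."""
    mapping = []
    for a in labels1:
        for b in labels2:
            rel = relation(a, b)
            if rel is not None:
                mapping.append((a, b, rel))
    return mapping


print(match(["car", "music"], ["automobile", "photos"]))
```

Note that the sketch compares labels in isolation; it does not yet take the position of a node in its structure into account.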

The technique works on lightweight ontologies, namely tree structures where each node is labeled with a natural language sentence, for example in English. These sentences are translated into formal logical formulas (expressed in an artificial, unambiguous language) that codify the meaning of each node, taking into account its position in the graph. For example, if the folder “car” is located under another folder “red”, the meaning of the folder “car” is “red car”. This is translated into the logical formula “red AND car”.
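As a minimal sketch of this translation, the formula of a node can be built as the conjunction of the labels found on the path from the root down to that node. The Node class and the node_formula function are hypothetical names used only for illustration, and the sketch assumes that each label is already a single unambiguous concept; translating real natural language labels would require parsing and word sense disambiguation, which are omitted here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A tree node with a label and an optional parent (None for the root)."""
    label: str
    parent: Optional["Node"] = None


def node_formula(node: Node) -> str:
    """Conjoin the labels on the path from the root down to the given node."""
    labels = []
    current: Optional[Node] = node
    while current is not None:
        labels.append(current.label)
        current = current.parent
    # Labels were collected from the node upwards, so reverse them.
    return " AND ".join(reversed(labels))


red = Node("red")
car = Node("car", parent=red)
print(node_formula(car))
```

For the folder “car” placed under the folder “red”, the function returns the string “red AND car”, matching the example in the text.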

The output of matching is a mapping, namely a set of semantic correspondences between the two graphs. Each mapping element carries a semantic relation, for example equivalence. Among all possible mappings, the minimal mapping is a high-quality mapping such that i) all the other mapping elements can be computed from the ones in the minimal set in time linear in the size of the input graphs, and ii) none of the mapping elements in the minimal set can be dropped without losing property i).
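The idea of dropping derivable elements can be illustrated with a simplified sketch. It is not the algorithm of [2]: it assumes a single relation, “less general than”, and a single redundancy rule that follows from node formulas being conjunctions along the path from the root (a descendant is at least as specific as its ancestors). Under that assumption, an element (c, d) is redundant with respect to (a, b) when a is an ancestor of c, or c itself, and d is an ancestor of b, or b itself. All names, including the parent dictionaries and the example trees, are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Element = Tuple[str, str]  # (node of tree 1, node of tree 2), read as "less general than"


def ancestors_or_self(node: str, parent: Dict[str, str]) -> Set[str]:
    """Collect the node and all of its ancestors by following parent links."""
    result = set()
    current = node
    while current is not None:
        result.add(current)
        current = parent.get(current)  # None once the root is reached
    return result


def is_redundant(elem: Element, other: Element,
                 parent1: Dict[str, str], parent2: Dict[str, str]) -> bool:
    """Check the assumed rule: elem = (c, d) follows from other = (a, b)."""
    if elem == other:
        return False
    (c, d), (a, b) = elem, other
    return a in ancestors_or_self(c, parent1) and d in ancestors_or_self(b, parent2)


def minimal(mapping: List[Element],
            parent1: Dict[str, str], parent2: Dict[str, str]) -> List[Element]:
    """Keep only the elements that cannot be derived from another element."""
    return [e for e in mapping
            if not any(is_redundant(e, o, parent1, parent2) for o in mapping)]


# Tree 1: "car" with child "red car". Tree 2: "transport" with child "automobile".
parent1 = {"red car": "car"}
parent2 = {"automobile": "transport"}
mapping = [("car", "automobile"), ("car", "transport"),
           ("red car", "automobile"), ("red car", "transport")]
print(minimal(mapping, parent1, parent2))
```

In this toy case only the element relating “car” to “automobile” is kept; the other three can be recomputed from it by walking down the first tree and up the second.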

The main advantage of minimal mappings is that they contain the minimum amount of information that needs to be dealt with. This is an important feature, because the number of possible mapping elements can grow up to n*m, where n and m are the sizes of the two input ontologies. In particular, minimal mappings become crucial with large ontologies, e.g. DMOZ, where even relatively small subsets of the possible mapping elements, potentially millions of them, are unmanageable.

Minimal mappings also provide clear usability advantages. Many systems and corresponding interfaces, mostly graphical, have been provided for the management of mappings, but they scale poorly as the number of nodes increases, and the resulting visualizations are rather messy [1]. Furthermore, maintaining smaller mappings makes the user's work much easier, faster and less error-prone.

See [2] for a formal definition of minimal and, dually, redundant mappings, evidence that the set of minimal mappings always exists and is unique, and an algorithm for computing them.

See also